Incremental Feature Selection and ℓ1 Regularization for Relaxed Maximum-Entropy Modeling
Authors
Abstract
We present an approach to bounded constraint relaxation for entropy maximization that corresponds to using a double-exponential prior or ℓ1 regularizer in likelihood maximization for log-linear models. We show that a combined incremental feature selection and regularization method can be established for maximum entropy modeling by a natural incorporation of the regularizer into gradient-based feature selection, following Perkins et al. (2003). This provides an efficient alternative to standard ℓ1 regularization on the full feature set, and a mathematical justification for thresholding techniques used in likelihood-based feature selection. We also motivate an extension to n-best feature selection for linguistic feature sets with moderate redundancy, and present experimental results showing its advantage over ℓ0 regularization, 1-best ℓ1 regularization, ℓ2 regularization, and standard incremental feature selection for the task of maximum-entropy parsing.
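As a rough sketch of the combined selection-and-regularization scheme (not the authors' implementation: the data, the ℓ1 weight C, and the n-best size are invented here, and binary logistic regression stands in for the parsing model), the following Python example admits a zero-weight feature only when the gradient of the log-likelihood exceeds the ℓ1 weight, i.e. the grafting criterion of Perkins et al. (2003), extended to take the n best such features per round:

import numpy as np
from scipy.optimize import minimize

# Toy problem (invented): 200 samples, 50 candidate features, of which
# only the first three are informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
w_true = np.zeros(50)
w_true[:3] = [2.0, -1.5, 1.0]
y = (X @ w_true + 0.5 * rng.normal(size=200) > 0).astype(float)

C = 8.0      # l1 weight; doubles as the grafting admission threshold
N_BEST = 2   # number of features admitted per round (n-best selection)

def grad_ll(w, cols):
    """Gradient of the negative log-likelihood w.r.t. ALL features,
    evaluated at a model that uses only the columns in `cols`."""
    z = X[:, cols] @ w if cols else np.zeros(len(y))
    p = 1.0 / (1.0 + np.exp(-z))
    return X.T @ (p - y)

def penalized(uv, cols):
    """l1-penalized objective, smoothed by the split w = u - v, u, v >= 0."""
    k = len(cols)
    w = uv[:k] - uv[k:]
    p = 1.0 / (1.0 + np.exp(-(X[:, cols] @ w)))
    nll = -np.sum(y * np.log(p + 1e-12) + (1.0 - y) * np.log(1.0 - p + 1e-12))
    g = X[:, cols].T @ (p - y)
    return nll + C * uv.sum(), np.concatenate([g + C, -g + C])

active, w = [], np.zeros(0)
while True:
    g = grad_ll(w, active)
    for j in active:                      # already-active features don't compete
        g[j] = 0.0
    # Grafting test: a feature frozen at zero can improve the penalized
    # objective only if the magnitude of its log-likelihood gradient
    # exceeds the l1 weight C; take the N_BEST strongest such features.
    picks = [int(j) for j in np.argsort(-np.abs(g))[:N_BEST] if abs(g[j]) > C]
    if not picks:
        break
    active += picks
    uv0 = np.concatenate([np.maximum(w, 0.0), np.zeros(len(picks)),
                          np.maximum(-w, 0.0), np.zeros(len(picks))])
    res = minimize(penalized, uv0, args=(active,), jac=True,
                   method="L-BFGS-B", bounds=[(0.0, None)] * (2 * len(active)))
    w = res.x[:len(active)] - res.x[len(active):]

print("selected features:", sorted(active))

The u - v split keeps the ℓ1 term differentiable, so L-BFGS-B with nonnegativity bounds can optimize it directly; on this toy data the loop typically recovers the three informative features, occasionally plus a noise feature that clears the invented threshold.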
Similar Resources
Department of Statistics Seminar (Spring 2011): On Optimal Estimation of a Nonsmooth Functional; Entire Relaxation Path for Maximum Entropy Models
In this talk I will discuss some recent work on optimal estimation of nonsmooth functionals. These problems exhibit some interesting features that are significantly different from those that occur in estimating conventional smooth functionals. This is a setting where standard techniques fail. I will discuss a newly developed general minimax lower bound technique that is based on testing two fuz...
Maximum Entropy Density Estimation and Modeling Geographic Distributions of Species
The maximum entropy (maxent) approach, formally equivalent to maximum likelihood, is a widely used density-estimation method. When input datasets are small, maxent is likely to overfit. Overfitting can be eliminated by various smoothing techniques, such as regularization and constraint relaxation, but the theory explaining their properties is often missing or needs to be derived for each case separatel...
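The relaxation-regularization equivalence mentioned in both abstracts can be checked numerically (a minimal sketch with invented numbers, unrelated to the species data of that paper): maximizing entropy under the relaxed constraints |E_p[f_j] - f̂_j| ≤ β is dual to minimizing the negative log-likelihood plus β times the ℓ1 norm of the weights, so at the ℓ1 optimum no feature expectation deviates from its empirical mean by more than β:

import numpy as np
from scipy.optimize import minimize

# Invented setup: a maxent density over 20 discrete cells with 6 binary
# features, estimated from only 15 samples (small enough to overfit).
rng = np.random.default_rng(1)
F = (rng.random((20, 6)) < 0.4).astype(float)   # feature table f(x)
data = rng.integers(0, 20, size=15)             # observed cell indices
f_emp = F[data].mean(axis=0)                    # empirical feature means

beta = 0.1   # relaxation width for |E_p[f_j] - f_emp[j]| <= beta

def objective(uv):
    lam = uv[:6] - uv[6:]                       # split keeps the l1 term smooth
    logits = F @ lam
    m = logits.max()
    logZ = np.log(np.exp(logits - m).sum()) + m
    nll = logZ - f_emp @ lam                    # average negative log-likelihood
    p = np.exp(logits - logZ)                   # model distribution over cells
    g = F.T @ p - f_emp                         # E_p[f] minus empirical mean
    return nll + beta * uv.sum(), np.concatenate([g + beta, -g + beta])

res = minimize(objective, np.zeros(12), jac=True,
               method="L-BFGS-B", bounds=[(0.0, None)] * 12)
lam = res.x[:6] - res.x[6:]
p = np.exp(F @ lam)
p /= p.sum()
# The KKT conditions of the l1 problem enforce the relaxed constraints:
print("max |E_p[f] - f_emp|:", np.abs(F.T @ p - f_emp).max())  # <= beta
print("nonzero weights:", int(np.count_nonzero(np.abs(lam) > 1e-8)))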
A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing
This paper presents a comparative study of five parameter estimation algorithms on four NLP tasks. Three of the five algorithms are well-known in the computational linguistics community: Maximum Entropy (ME) estimation with L2 regularization, the Averaged Perceptron (AP), and Boosting. We also investigate ME estimation with L1 regularization using a novel optimization algorithm, and BLasso, whi...
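The qualitative difference between the two ME regularizers compared in that study is easy to reproduce on toy data (an illustration only, not the paper's experiments; scikit-learn's liblinear solver and all constants are my own choices): ℓ1 drives most weights exactly to zero, while ℓ2 only shrinks them:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy comparison (invented data): 40 candidate features, 2 informative.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 40))
y = (X[:, 0] - X[:, 1] + 0.3 * rng.normal(size=300) > 0).astype(int)

for penalty in ("l1", "l2"):
    clf = LogisticRegression(penalty=penalty, C=0.5, solver="liblinear")
    clf.fit(X, y)
    nz = int(np.count_nonzero(np.abs(clf.coef_) > 1e-6))
    print(f"{penalty}: {nz}/40 nonzero weights")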
Feature Selection and Dualities in Maximum Entropy Discrimination
Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED...
On Bayesian Inference, Maximum Entropy and Support Vector Machines Methods
The analysis of discrimination, feature, and model selection leads to a discussion of the relationships between the Support Vector Machine (SVM), Bayesian, and Maximum Entropy (MaxEnt) formalisms. MaxEnt discrimination can be seen as a particular case of Bayesian inference, which in turn can be seen as a regularization approach applicable to SVM. Probability measures can be attached to each f...
Journal:
Volume, Issue:
Pages:
Publication date: 2004